Random Fourier Features


A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent

Neural Information Processing Systems

This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of feature space $N$ are all large and comparable. In this regime, the random RFF Gram matrix no longer converges to the well-known limiting Gaussian kernel matrix (as it does when $N \to \infty$ alone), but it still has a tractable behavior that is captured by our analysis. This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$. Based on these estimates, a precise characterization of two qualitatively different phases of learning, including the phase transition between them, is provided; and the corresponding double descent test error curve is derived from this phase transition behavior. These results do not depend on strong assumptions on the data distribution, and they perfectly match empirical results on real-world data sets.
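
To make the setting concrete, below is a minimal sketch of RFF ridge regression, assuming a Gaussian spectral distribution for the frequencies and synthetic placeholder data; the sizes n, p, N and the regularization lam are illustrative and not the paper's experimental values.

import numpy as np

# Illustrative sizes: n samples in dimension p, mapped to N random Fourier features.
n, p, N, lam = 512, 64, 256, 1e-2
rng = np.random.default_rng(0)

X_train = rng.standard_normal((n, p))
y_train = rng.standard_normal(n)              # placeholder targets
X_test, y_test = rng.standard_normal((n, p)), rng.standard_normal(n)

# RFF map z(x) = [cos(Wx), sin(Wx)] / sqrt(N) with W ~ N(0, I); its Gram matrix
# converges to the Gaussian kernel matrix only when N grows alone, not when
# n, p, N are all large and comparable (the regime studied above).
W = rng.standard_normal((N, p))

def rff(X):
    proj = X @ W.T
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(N)

Z_train, Z_test = rff(X_train), rff(X_test)

# Ridge regression on the random features, then training and test errors.
beta = np.linalg.solve(Z_train.T @ Z_train + lam * np.eye(2 * N), Z_train.T @ y_train)
train_mse = np.mean((Z_train @ beta - y_train) ** 2)
test_mse = np.mean((Z_test @ beta - y_test) ** 2)
print(f"train MSE: {train_mse:.3f}, test MSE: {test_mse:.3f}")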


Cluster-Based Generalized Additive Models Informed by Random Fourier Features

Xin Huang, Jia Li, Jun Yu

arXiv.org Machine Learning

Explainable machine learning aims to strike a balance between prediction accuracy and model transparency, particularly in settings where black-box predictive models, such as deep neural networks or kernel-based methods, achieve strong empirical performance but remain difficult to interpret. This work introduces a mixture of generalized additive models (GAMs) in which random Fourier feature (RFF) representations are leveraged to uncover locally adaptive structure in the data. In the proposed method, an RFF-based embedding is first learned and then compressed via principal component analysis. The resulting low-dimensional representations are used to perform soft clustering of the data through a Gaussian mixture model. These cluster assignments are then applied to construct a mixture-of-GAMs framework, where each local GAM captures nonlinear effects through interpretable univariate smooth functions. Numerical experiments on real-world regression benchmarks, including the California Housing, NASA Airfoil Self-Noise, and Bike Sharing datasets, demonstrate improved predictive performance relative to classical interpretable models. Overall, this construction provides a principled approach for integrating representation learning with transparent statistical modeling.
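
A minimal sketch of the pipeline described above, assuming scikit-learn components (RBFSampler for the RFF embedding, PCA for compression, GaussianMixture for soft clustering) and, as a stand-in for the per-cluster GAMs, responsibility-weighted ridge regressors; all data and hyperparameters are illustrative.

import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 8))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(1000)

# 1) RFF embedding, 2) PCA compression, 3) soft clustering via a Gaussian mixture.
Z = RBFSampler(gamma=1.0, n_components=200, random_state=0).fit_transform(X)
Z_low = PCA(n_components=5).fit_transform(Z)
resp = GaussianMixture(n_components=3, random_state=0).fit(Z_low).predict_proba(Z_low)

# 4) One local model per cluster (ridge here, standing in for an interpretable GAM),
#    fitted with the soft responsibilities and combined as a mixture prediction.
models = [Ridge(alpha=1.0).fit(X, y, sample_weight=resp[:, k]) for k in range(3)]
y_hat = sum(resp[:, k] * models[k].predict(X) for k in range(3))
print("train MSE:", np.mean((y - y_hat) ** 2))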


Primal: A Unified Deterministic Framework for Quasi-Orthogonal Hashing and Manifold Learning

Vladimer Khasia

arXiv.org Artificial Intelligence

We present Primal, a deterministic feature mapping framework that harnesses the number-theoretic independence of prime square roots to construct robust, tunable vector representations. Diverging from standard stochastic projections (e.g., Random Fourier Features), our method exploits the Besicovitch property to create irrational frequency modulations that guarantee infinite non-repeating phase trajectories. We formalize two distinct algorithmic variants: (1) StaticPrime, a sequence generation method that produces temporal position encodings empirically approaching the theoretical Welch bound for quasi-orthogonality; and (2) DynamicPrime, a tunable projection layer for input-dependent feature mapping. A central novelty of the dynamic framework is its ability to unify two disparate mathematical utility classes through a single scaling parameter σ. In the low-frequency regime, the method acts as an isometric kernel map, effectively linearizing non-convex geometries (e.g., spirals) to enable high-fidelity signal reconstruction and compressive sensing. Conversely, the high-frequency regime induces chaotic phase wrapping, transforming the projection into a maximum-entropy one-way hash suitable for Hyperdimensional Computing and privacy-preserving Split Learning. Empirical evaluations demonstrate that our framework yields superior orthogonality retention and distribution tightness compared to normalized Gaussian baselines, establishing it as a computationally efficient, mathematically rigorous alternative to random matrix projections. The code is available at https://github.com/VladimerKhasia/primal
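
A rough sketch of the idea as described in the abstract: deterministic frequencies built from square roots of primes, modulated by a single scale σ (sigma below). This illustration is inferred from the abstract, not the authors' reference implementation (see the linked repository for that), and it assumes sympy for prime generation.

import numpy as np
from sympy import prime

def prime_frequencies(n_features, dim, sigma=1.0):
    # Deterministic frequency matrix from square roots of the first n_features*dim primes.
    # Irrationality of sqrt(p) yields non-repeating phase trajectories, unlike random draws.
    roots = np.array([np.sqrt(float(prime(k))) for k in range(1, n_features * dim + 1)])
    return sigma * roots.reshape(n_features, dim)

def primal_features(X, n_features=64, sigma=1.0):
    W = prime_frequencies(n_features, X.shape[1], sigma)
    proj = X @ W.T
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(n_features)

X = np.random.default_rng(0).standard_normal((16, 3))
Z_iso = primal_features(X, sigma=0.1)    # low-frequency regime: near-isometric kernel map
Z_hash = primal_features(X, sigma=50.0)  # high-frequency regime: phase wrapping, hash-like codes
print(Z_iso.shape, Z_hash.shape)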





Sampled Softmax with Random Fourier Features

Ankit Singh Rawat, Jiecao Chen, Felix Xinnan X. Yu, Ananda Theertha Suresh, Sanjiv Kumar

Neural Information Processing Systems

Motivated by our analysis and the work on kernel-based sampling, we propose the Random Fourier Softmax (RF-softmax) method, which uses Random Fourier Features to enable more efficient and accurate sampling from an approximate softmax distribution. We show that RF-softmax leads to low bias in estimating both the full softmax distribution and the full softmax gradient.
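
A minimal sketch of the kernel-based sampling idea behind RF-softmax: draw negative classes from a proposal q(c) proportional to an RFF-based kernel estimate between the context embedding and each class embedding, then correct the sampled logits by subtracting log q. The Gaussian-kernel feature map on normalized embeddings and all sizes below are illustrative, not the paper's exact construction.

import numpy as np

rng = np.random.default_rng(0)
d, n_classes, n_rff, n_neg = 32, 10_000, 128, 64
W_cls = rng.standard_normal((n_classes, d))   # class embeddings
h = rng.standard_normal(d)                    # context/query embedding

# RFF map for the Gaussian kernel; on unit-norm vectors it is a monotone proxy
# for exp(h^T w), the quantity the full softmax weighs.
Omega = rng.standard_normal((n_rff, d))
b = rng.uniform(0.0, 2 * np.pi, n_rff)

def phi(v):
    v = v / np.linalg.norm(v, axis=-1, keepdims=True)
    return np.sqrt(2.0 / n_rff) * np.cos(v @ Omega.T + b)

# Proposal q(c) proportional to phi(h)^T phi(w_c): one matrix-vector product over all classes.
scores = phi(W_cls) @ phi(h)
q = np.clip(scores, 1e-12, None)              # the kernel estimate can dip below zero
q /= q.sum()
negatives = rng.choice(n_classes, size=n_neg, replace=False, p=q)

# Importance-corrected logits for the sampled-softmax loss.
corrected_logits = W_cls[negatives] @ h - np.log(q[negatives])
print(negatives[:5], corrected_logits[:3])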



Supplementary Material for Spectrum Gaussian Processes Learning from Noisy and Sparse Data: A. Derivation of the spectral representation

Neural Information Processing Systems

The ELBO is derived from Jensen's inequality as follows:

$$\log p(Y) \ge \iiint q(X, f, w) \log \frac{p(Y, X, f, w)}{q(X, f, w)}\, \mathrm{d}w\, \mathrm{d}f\, \mathrm{d}X = \iiint p(f \mid w)\, q(w) \cdots \qquad (31)$$

The inference procedure of SSGP is shown in Algorithm 1. In the experiments, we set the integration time window to 1, and the parameters are updated by maximizing the ELBO (13) evaluated using $D$. In this appendix, we also describe the baseline models for the experiments in Section 6. D-SymODEN can also be applied to dissipative systems. SympGPR can estimate conservative vector fields from derivative observations by considering Hamiltonian mechanics; we used finite differences for training.
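
For completeness, the Jensen step behind inequality (31), written for a generic variational distribution $q(X, f, w)$; this is a generic reconstruction of the argument, not the supplementary material's exact algebra:

$$\log p(Y) = \log \iiint q(X, f, w)\, \frac{p(Y, X, f, w)}{q(X, f, w)}\, \mathrm{d}w\, \mathrm{d}f\, \mathrm{d}X \;\ge\; \iiint q(X, f, w)\, \log \frac{p(Y, X, f, w)}{q(X, f, w)}\, \mathrm{d}w\, \mathrm{d}f\, \mathrm{d}X,$$

where the inequality follows from applying Jensen's inequality to the concave logarithm; the right-hand side is the ELBO maximized during inference.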


Appendix: Structure-Aware Random Fourier Kernel for Graphs

Jinyuan Fang

Neural Information Processing Systems

Classical RFF requires the kernel to be the Fourier transform of a positive finite measure; hence RFF is limited to a small range of simple kernel functions such as the RBF kernel. The KL divergence is taken between the variational distribution and the true posterior over the latent variable $A$. Algorithm 1 summarizes the proposed GPSRF approach for the semi-supervised object classification task; we provide an overview of the optimization process of GPSRF for object classification in Algorithm 1, with the ELBO defined in Eq. (7). The features of each node are bag-of-words representations of the corresponding publications.
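
The requirement noted above is Bochner's theorem: a continuous shift-invariant kernel is positive definite exactly when it is the Fourier transform of a positive finite measure,

$$k(x - y) = \int_{\mathbb{R}^d} e^{\, i\, \omega^{\top} (x - y)}\, \mathrm{d}\mu(\omega),$$

so classical RFFs draw $\omega_1, \dots, \omega_N \sim \mu$ and use $z(x) = \sqrt{2/N}\, \big[\cos(\omega_i^{\top} x + b_i)\big]_{i=1}^{N}$, which restricts the approach to kernels whose spectral measure $\mu$ is known and easy to sample, such as the Gaussian measure of the RBF kernel.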